{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# TensorFlow2教程-Eager Execution"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 最全TensorFlow 2.0 入门教程持续更新：https://zhuanlan.zhihu.com/p/59507137\n",
    "> \n",
    "> 完整TensorFlow2.0教程代码请看https://github.com/czy36mengfei/tensorflow2_tutorials_chinese (欢迎star)\n",
    "> \n",
    "> 最新TensorFlow 2教程和相关资源，请关注微信公众号：DoitNLP， 后面我会在DoitNLP上，持续更新深度学习、NLP、Tensorflow的相关教程和前沿资讯，它将成为我们一起学习TensorFlow2的大本营。\n",
    ">\n",
    "> 本教程主要由tensorflow2.0官方教程的个人学习复现笔记整理而来，中文讲解，方便喜欢阅读中文教程的朋友，tensorflow官方教程：https://www.tensorflow.org"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TensorFlow 的 Eager Execution 是一种命令式编程环境，可立即评估操作，无需构建图：操作会返回具体的值，而不是构建以后再运行的计算图。这样可以轻松地使用 TensorFlow 和调试模型，并且还减少了样板代码。\n",
    "\n",
    "Eager Execution 是一个灵活的机器学习平台，用于研究和实验，可提供：\n",
    "\n",
    "- 直观的界面 - 自然地组织代码结构并使用 Python 数据结构。快速迭代小模型和小型数据集。\n",
    "- 更轻松的调试功能 - 直接调用操作以检查正在运行的模型并测试更改。使用标准 Python 调试工具进行即时错误报告。\n",
    "- 自然控制流程 - 使用 Python 控制流程而不是图控制流程，简化了动态模型的规范。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2.0.0-beta1\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from __future__ import absolute_import, division, print_function\n",
    "\n",
    "import tensorflow as tf\n",
    "print(tf.__version__)\n",
    "# 在tensorflow2中默认使用Eager Execution\n",
    "tf.executing_eagerly()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.Eager Execution下运算"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在Eager Execution下可以直接进行运算，结果会立即返回"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tf.Tensor([[9.]], shape=(1, 1), dtype=float32)\n"
     ]
    }
   ],
   "source": [
    "x = [[3.]]\n",
    "m = tf.matmul(x, x)\n",
    "print(m)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "启用Eager Execution会改变TensorFlow操作的行为 - 现在他们会立即评估并将其值返回给Python。tf.Tensor对象引用具体值，而不再是指向计算图中节点的符号句柄。由于在会话中没有构建和运行的计算图，因此使用print()或调试程序很容易检查结果。评估，打印和检查张量值不会破坏计算梯度的流程。\n",
    "\n",
    "Eager Execution可以与NumPy很好地协作。NumPy操作接受tf.Tensor参数。TensorFlow 数学运算将Python对象和NumPy数组转换为tf.Tensor对象。tf.Tensor.numpy方法将对象的值作为NumPy的ndarray类型返回。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tf.Tensor(\n",
      "[[1 9]\n",
      " [3 6]], shape=(2, 2), dtype=int32)\n",
      "tf.Tensor(\n",
      "[[ 3 11]\n",
      " [ 5  8]], shape=(2, 2), dtype=int32)\n",
      "tf.Tensor(\n",
      "[[ 3 99]\n",
      " [15 48]], shape=(2, 2), dtype=int32)\n",
      "[[ 3 99]\n",
      " [15 48]]\n",
      "[[1 9]\n",
      " [3 6]]\n"
     ]
    }
   ],
   "source": [
    "# tf.Tensor对象引用具体值\n",
    "a = tf.constant([[1,9],[3,6]])\n",
    "print(a)\n",
    "\n",
    "# 支持broadcasting（广播：不同shape的数据进行数学运算）\n",
    "b = tf.add(a, 2)\n",
    "print(b)\n",
    "\n",
    "# 支持运算符重载\n",
    "print(a*b)\n",
    "\n",
    "# 可以当做numpy数据使用\n",
    "import numpy as np\n",
    "s = np.multiply(a,b)\n",
    "print(s)\n",
    "\n",
    "# 转换为numpy类型\n",
    "print(a.numpy())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.动态控制流"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Eager Execution的一个主要好处是在执行模型时可以使用宿主语言(Python)的所有功能。所以，例如，写fizzbuzz很容易："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1\n",
      "2\n",
      "Fizz\n",
      "4\n",
      "Buzz\n",
      "Fizz\n",
      "7\n",
      "8\n",
      "Fizz\n",
      "Buzz\n",
      "11\n",
      "Fizz\n",
      "13\n",
      "14\n",
      "FizzBuzz\n",
      "16\n"
     ]
    }
   ],
   "source": [
    "def fizzbuzz(max_num):\n",
    "    counter = tf.constant(0)\n",
    "    max_num = tf.convert_to_tensor(max_num)\n",
    "    # 使用range遍历\n",
    "    for num in range(1, max_num.numpy()+1):\n",
    "        # 重新转为tensor类型\n",
    "        num = tf.constant(num)\n",
    "        # 使用if-elif 做判断\n",
    "        if int(num % 3) == 0 and int(num % 5) == 0:\n",
    "            print('FizzBuzz')\n",
    "        elif int(num % 3) == 0:\n",
    "            print('Fizz')\n",
    "        elif int(num % 5) == 0:\n",
    "            print('Buzz')\n",
    "        else:\n",
    "            print(num.numpy())\n",
    "        counter += 1  # 自加运算\n",
    "fizzbuzz(16)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.在Eager Execution下训练\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 计算梯度\n",
    "自动微分对于实现机器学习算法（例如用于训练神经网络的反向传播）来说是很有用的。在 Eager Execution中，使用 tf.GradientTape 来跟踪操作以便稍后计算梯度。\n",
    "\n",
    "可以用tf.GradientTape来训练和/或计算梯度。它对复杂的训练循环特别有用。\n",
    "\n",
    "由于在每次调用期间可能发生不同的操作，所有前向传递操作都被记录到“磁带”中。要计算梯度，请向反向播放磁带，然后丢弃。特定的tf.GradientTape只能计算一个梯度; 后续调用会引发运行时错误。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tf.Tensor([[2.]], shape=(1, 1), dtype=float32)\n"
     ]
    }
   ],
   "source": [
    "w = tf.Variable([[1.0]])\n",
    "# 用tf.GradientTape()记录梯度\n",
    "with tf.GradientTape() as tape:\n",
    "    loss = w*w\n",
    "grad = tape.gradient(loss, w)  # 计算梯度\n",
    "print(grad)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 训练模型"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Logits:  [[-0.0330125   0.00557926  0.01437856 -0.07978292 -0.05083258 -0.02875553\n",
      "  -0.02977117 -0.02533271  0.0575003   0.03532691]]\n",
      "........................................"
     ]
    },
    {
     "data": {
      "text/plain": [
       "Text(0, 0.5, 'Loss [entropy]')"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 导入mnist数据\n",
    "(mnist_images, mnist_labels), _ = tf.keras.datasets.mnist.load_data()\n",
    "# 数据转换\n",
    "dataset = tf.data.Dataset.from_tensor_slices(\n",
    "  (tf.cast(mnist_images[...,tf.newaxis]/255, tf.float32),\n",
    "   tf.cast(mnist_labels,tf.int64)))\n",
    "# 数据打乱与分批次\n",
    "dataset = dataset.shuffle(1000).batch(32)\n",
    "# 使用Sequential构建一个卷积网络\n",
    "mnist_model = tf.keras.Sequential([\n",
    "  tf.keras.layers.Conv2D(16,[3,3], activation='relu', \n",
    "                         input_shape=(None, None, 1)),\n",
    "  tf.keras.layers.Conv2D(16,[3,3], activation='relu'),\n",
    "  tf.keras.layers.GlobalAveragePooling2D(),\n",
    "  tf.keras.layers.Dense(10)\n",
    "])\n",
    "# 展示数据\n",
    "# 即使没有经过培训，也可以调用模型并在Eager Execution中检查输出\n",
    "for images,labels in dataset.take(1):\n",
    "    print(\"Logits: \", mnist_model(images[0:1]).numpy())\n",
    "    \n",
    "# 优化器与损失函数\n",
    "optimizer = tf.keras.optimizers.Adam()\n",
    "loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)\n",
    "\n",
    "# 按批次训练\n",
    "# 虽然 keras 模型具有内置训练循环（fit 方法），但有时需要更多自定义设置。下面是一个用 eager 实现的训练循环示例：\n",
    "loss_history = []\n",
    "for (batch, (images, labels)) in enumerate(dataset.take(400)):\n",
    "    if batch % 10 == 0:\n",
    "        print('.', end='')\n",
    "    with tf.GradientTape() as tape:\n",
    "        # 获取预测结果\n",
    "        logits = mnist_model(images, training=True)\n",
    "        # 获取损失\n",
    "        loss_value = loss_object(labels, logits)\n",
    "\n",
    "    loss_history.append(loss_value.numpy().mean())\n",
    "    # 获取本批数据梯度\n",
    "    grads = tape.gradient(loss_value, mnist_model.trainable_variables)\n",
    "    # 反向传播优化\n",
    "    optimizer.apply_gradients(zip(grads, mnist_model.trainable_variables))\n",
    "    \n",
    "# 绘图展示loss变化\n",
    "import matplotlib.pyplot as plt\n",
    "plt.plot(loss_history)\n",
    "plt.xlabel('Batch #')\n",
    "plt.ylabel('Loss [entropy]')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.变量求导优化"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "tf.Variable对象存储在训练期间访问的可变tf.Tensor值，以使自动微分更容易。模型的参数可以作为变量封装在类中。\n",
    "\n",
    "将tf.Variable 和tf.GradientTape 结合，可以更好地封装模型参数。例如，可以重写上面的自动微分示例为："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Initial loss: 69.425\n",
      "Loss at step 000: 66.713\n",
      "Loss at step 020: 30.274\n",
      "Loss at step 040: 14.048\n",
      "Loss at step 060: 6.822\n",
      "Loss at step 080: 3.604\n",
      "Loss at step 100: 2.171\n",
      "Loss at step 120: 1.533\n",
      "Loss at step 140: 1.249\n",
      "Loss at step 160: 1.122\n",
      "Loss at step 180: 1.066\n",
      "Loss at step 200: 1.040\n",
      "Loss at step 220: 1.029\n",
      "Loss at step 240: 1.024\n",
      "Loss at step 260: 1.022\n",
      "Loss at step 280: 1.021\n",
      "Final loss: 1.021\n",
      "W = 2.9795801639556885, B = 2.0041041374206543\n"
     ]
    }
   ],
   "source": [
    "class MyModel(tf.keras.Model):\n",
    "    def __init__(self):\n",
    "        super(MyModel, self).__init__()\n",
    "        self.W = tf.Variable(5., name='weight')\n",
    "        self.B = tf.Variable(10., name='bias')\n",
    "    def call(self, inputs):\n",
    "        return inputs * self.W + self.B\n",
    "\n",
    "# 满足函数3 * x + 2的数据\n",
    "NUM_EXAMPLES = 2000\n",
    "training_inputs = tf.random.normal([NUM_EXAMPLES])\n",
    "noise = tf.random.normal([NUM_EXAMPLES])\n",
    "training_outputs = training_inputs * 3 + 2 + noise\n",
    "\n",
    "# 损失函数\n",
    "def loss(model, inputs, targets):\n",
    "    error = model(inputs) - targets\n",
    "    return tf.reduce_mean(tf.square(error))\n",
    "\n",
    "# 梯度函数\n",
    "def grad(model, inputs, targets):\n",
    "    with tf.GradientTape() as tape:\n",
    "        loss_value = loss(model, inputs, targets)\n",
    "    return tape.gradient(loss_value, [model.W, model.B])\n",
    "\n",
    "# 模型与优化器\n",
    "model = MyModel()\n",
    "optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)\n",
    "\n",
    "print(\"Initial loss: {:.3f}\".format(loss(model, training_inputs, training_outputs)))\n",
    "\n",
    "# 训练循环， 反向传播优化\n",
    "for i in range(300):\n",
    "    grads = grad(model, training_inputs, training_outputs)\n",
    "    optimizer.apply_gradients(zip(grads, [model.W, model.B]))\n",
    "    if i % 20 == 0:\n",
    "        print(\"Loss at step {:03d}: {:.3f}\".format(i, loss(model, training_inputs, training_outputs)))\n",
    "\n",
    "print(\"Final loss: {:.3f}\".format(loss(model, training_inputs, training_outputs)))\n",
    "print(\"W = {}, B = {}\".format(model.W.numpy(), model.B.numpy()))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5.Eager Execution中的对象"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "使用 Graph Execution 时，程序状态（如变量）存储在全局集合中，它们的生命周期由 tf.Session 对象管理。相反，在 Eager Execution 期间，状态对象的生命周期由其对应的 Python 对象的生命周期决定。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 变量对象\n",
    "变量将持续存在，直到删除对象的最后一个引用，然后变量被删除。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "if tf.test.is_gpu_available():\n",
    "    with tf.device(\"gpu:0\"):\n",
    "        v = tf.Variable(tf.random.normal([1000, 1000]))\n",
    "        v = None  # v no longer takes up GPU memory"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 基于对象的保存\n",
    "tf.train.Checkpoint 可以将 tf.Variable 保存到检查点并从中恢复："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<tf.Variable 'Variable:0' shape=() dtype=float32, numpy=1.0>\n"
     ]
    }
   ],
   "source": [
    "# 使用检测点保存变量\n",
    "x = tf.Variable(6.0)\n",
    "checkpoint = tf.train.Checkpoint(x=x)\n",
    "# 变量的改变会同步到检测点\n",
    "x.assign(1.0)\n",
    "checkpoint.save('./ckpt/')\n",
    "# 检测点保存后，变量的改变对检测点无影响\n",
    "x.assign(8.0)\n",
    "checkpoint.restore(tf.train.latest_checkpoint('./ckpt/'))\n",
    "print(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "要保存和加载模型，tf.train.Checkpoint 会存储对象的内部状态，而不需要隐藏变量。要记录 model、optimizer 和全局步的状态，可以将它们传递到 tf.train.Checkpoint："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x1496dcd9588>"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 模型保持\n",
    "import os\n",
    "model = tf.keras.Sequential([\n",
    "  tf.keras.layers.Conv2D(16,[3,3], activation='relu'),\n",
    "  tf.keras.layers.GlobalAveragePooling2D(),\n",
    "  tf.keras.layers.Dense(10)\n",
    "])\n",
    "optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)\n",
    "checkpoint_dir = './ck_model_dir'\n",
    "if not os.path.exists(checkpoint_dir):\n",
    "    os.makedirs(checkpoint_dir)\n",
    "checkpoint_prefix = os.path.join(checkpoint_dir, \"ckpt\")\n",
    "# 将优化器和模型记录至检测点\n",
    "root = tf.train.Checkpoint(optimizer=optimizer,\n",
    "                           model=model)\n",
    "# 保存检测点\n",
    "root.save(checkpoint_prefix)\n",
    "# 读取检测点\n",
    "root.restore(tf.train.latest_checkpoint(checkpoint_dir))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 面向对象的指标\n",
    "tf.keras.metrics存储为对象。通过将新数据传递给callable来更新度量标准，并使用tf.keras.metrics.result方法检索结果，例如："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tf.Tensor(2.5, shape=(), dtype=float32)\n",
      "tf.Tensor(5.5, shape=(), dtype=float32)\n"
     ]
    }
   ],
   "source": [
    "m = tf.keras.metrics.Mean('loss')\n",
    "m(0)\n",
    "m(5)\n",
    "print(m.result())  # => 2.5\n",
    "m([8, 9])\n",
    "print(m.result())  # => 5.5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6.自动微分高级内容"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 动态模型\n",
    "tf.GradientTape 也可用于动态模型。这个回溯线搜索算法示例看起来像普通的 NumPy 代码，除了存在梯度并且可微分，尽管控制流比较复杂："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "def line_search_step(fn, init_x, rate=1.0):\n",
    "    with tf.GradientTape() as tape:\n",
    "        # 变量会自动记录，但需要手动观察张量\n",
    "        tape.watch(init_x)\n",
    "        value = fn(init_x)\n",
    "    grad = tape.gradient(value, init_x)\n",
    "    grad_norm = tf.reduce_sum(grad * grad)\n",
    "    init_value = value\n",
    "    while value > init_value - rate * grad_norm:\n",
    "        x = init_x - rate * grad\n",
    "        value = fn(x)\n",
    "        rate /= 2.0\n",
    "    return x, value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 自定义梯度\n",
    "自定义梯度是在 Eager Execution 和 Graph Execution 中覆盖梯度的一种简单方式。在正向函数中，定义相对于输入、输出或中间结果的梯度。例如，下面是在反向传播中截断梯度范数的一种简单方式："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "@tf.custom_gradient\n",
    "def clip_gradient_by_norm(x, norm):\n",
    "    y = tf.identity(x)\n",
    "    def grad_fn(dresult):\n",
    "        return [tf.clip_by_norm(dresult, norm), None]\n",
    "    return y, grad_fn"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "自定义梯度可以提供数值稳定的梯度"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.5\n",
      "nan\n"
     ]
    }
   ],
   "source": [
    "def log1pexp(x):\n",
    "    return tf.math.log(1 + tf.exp(x))\n",
    "\n",
    "def grad_log1pexp(x):\n",
    "    with tf.GradientTape() as tape:\n",
    "        tape.watch(x)\n",
    "        value = log1pexp(x)\n",
    "    return tape.gradient(value, x)\n",
    "# 梯度计算在x = 0时工作正常。\n",
    "print(grad_log1pexp(tf.constant(0.)).numpy())\n",
    "# 但是，由于数值不稳定，x = 100失败。\n",
    "print(grad_log1pexp(tf.constant(100.)).numpy())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里，log1pexp函数可以使用自定义梯度求导进行分析简化。 下面的实现重用了在前向传递期间计算的tf.exp（x）的值 - 通过消除冗余计算使其更有效："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.5\n",
      "1.0\n"
     ]
    }
   ],
   "source": [
    "@tf.custom_gradient\n",
    "def log1pexp(x):\n",
    "    e = tf.exp(x)\n",
    "    def grad(dy):\n",
    "        return dy * (1 - 1 / (1 + e))\n",
    "    return tf.math.log(1 + e), grad\n",
    "\n",
    "def grad_log1pexp(x):\n",
    "    with tf.GradientTape() as tape:\n",
    "        tape.watch(x)\n",
    "        value = log1pexp(x)\n",
    "    return tape.gradient(value, x)\n",
    "# 和以前一样，梯度计算在x = 0时工作正常。\n",
    "print(grad_log1pexp(tf.constant(0.)).numpy())\n",
    "# 并且梯度计算也适用于x = 100\n",
    "print(grad_log1pexp(tf.constant(100.)).numpy())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 7.使用gpu提升性能"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在 Eager Execution 期间，计算会自动分流到 GPU。如果要控制计算运行的位置，可以将其放在 tf.device('/gpu:0') 块（或 CPU 等效块）中："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time to multiply a (1000, 1000) matrix by itself 200 times:\n",
      "CPU: 1.698425054550171 secs\n",
      "GPU: 0.13264727592468262 secs\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "\n",
    "def measure(x, steps):\n",
    "    # TensorFlow在第一次使用时初始化GPU，其不计入时间。\n",
    "    tf.matmul(x, x)\n",
    "    start = time.time()\n",
    "    for i in range(steps):\n",
    "        x = tf.matmul(x, x)\n",
    "    # tf.matmul可以在完成矩阵乘法之前返回（例如，\n",
    "    # 可以在对CUDA流进行操作之后返回）。 \n",
    "    # 下面的x.numpy（）调用将确保所有已排队\n",
    "    # 的操作都已完成（并且还将结果复制到主机内存，\n",
    "    # 因此我们只包括一些matmul操作时间）\n",
    "    _ = x.numpy()\n",
    "    end = time.time()\n",
    "    return end - start\n",
    "\n",
    "shape = (1000, 1000)\n",
    "steps = 200\n",
    "print(\"Time to multiply a {} matrix by itself {} times:\".format(shape, steps))\n",
    "\n",
    "# 在CPU上运行:\n",
    "with tf.device(\"/cpu:0\"):\n",
    "    print(\"CPU: {} secs\".format(measure(tf.random.normal(shape), steps)))\n",
    "\n",
    "# 在GPU上运行,如果可以的话:\n",
    "if tf.test.is_gpu_available():\n",
    "    with tf.device(\"/gpu:0\"):\n",
    "        print(\"GPU: {} secs\".format(measure(tf.random.normal(shape), steps)))\n",
    "else:\n",
    "    print(\"GPU: not found\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "tf.Tensor对象可以被复制到不同的设备来执行其操作"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "if tf.test.is_gpu_available():\n",
    "    x = tf.random.normal([10, 10])\n",
    "    # 将tensor对象复制到gpu上\n",
    "    x_gpu0 = x.gpu()\n",
    "    x_cpu = x.cpu()\n",
    "\n",
    "    _ = tf.matmul(x_cpu, x_cpu)    # Runs on CPU\n",
    "    _ = tf.matmul(x_gpu0, x_gpu0)  # Runs on GPU:0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
